** Data reading and variable selection from raw data
** Netherlands Kinship Panel Study (wave 1) 2002


** 01. Reading data **
cap log close
clear all
set more off
cd /*insert your working directory here*/
use /*insert your data file here*/

** 02. Constructing year and country variables **

ge year=2002
lab var year "survey year"

ge country=528
lab var country "ISO country code"
//Netherlands: 528 (see "ISO Country Codes.pdf")


** 03. ID variables **

//family identification number: the data contain no individual ID number, but each case was assigned a unique number
ge fid=famnum
lab var fid "person/family id"

ge pid=_n
lab var pid "person id"

** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=asex
lab var sex "sex"
lab def sex 0 "male" 1 "female"
lab val sex sex
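
//Optional check, a sketch only: the 0/1 coding of asex is taken from the label
//definition above and has not been verified against the NKPS codebook.
//Prints an error message (without stopping the do-file) if other values occur.
cap noi assert inlist(sex, 0, 1) | missing(sex)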

ge age=aage
lab var age "age"

ge birthyr=year-age
lab var birthyr "year of birth"


** 05. Siblings **

ge nbro=0
ge nsis=0
forvalues i=1/9 {
	replace nbro=nbro+1 if ax201s`i'==1
	replace nsis=nsis+1 if ax201s`i'==2
}
forvalues i=1/9 {
	replace nbro=nbro+1 if ax201t`i'==1
	replace nsis=nsis+1 if ax201t`i'==2
}
forvalues i=1/4 {
	replace nbro=nbro+1 if ax201u`i'==1
	replace nsis=nsis+1 if ax201u`i'==2
}
ge nsibs=nbro+nsis

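//birth order = 1 + number of siblings born in an earlier year than the respondent
//left missing when birthyr is missing: a missing birthyr would otherwise compare
//as larger than every sibling birth year and count all siblings as older
ge birthorder=1 if !missing(birthyr)
forvalues i=1/9 {
	replace birthorder=birthorder+1 if ax301s`i'<birthyr
}
forvalues i=1/9 {
	replace birthorder=birthorder+1 if ax301t`i'<birthyr
}
forvalues i=1/4 {
	replace birthorder=birthorder+1 if ax301u`i'<birthyr
}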
lab var nbro "number of brothers"
lab var nsis "number of sisters"
lab var nsibs "number of siblings"
lab var birthorder "birth order"
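
//Optional cross-check, a sketch only: the three brother loops above can be
//written as one nested loop over the suffixes s, t and u (u runs to 4, the
//others to 9, as above). nbro_chk is a hypothetical temporary variable.
tempvar nbro_chk
qui ge `nbro_chk'=0
foreach s in s t u {
	local max=cond("`s'"=="u", 4, 9)
	forvalues i=1/`max' {
		qui replace `nbro_chk'=`nbro_chk'+1 if ax201`s'`i'==1
	}
}
assert `nbro_chk'==nbro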


** 06. Own education **

*highest education obtained, not just attended
rename aedu educ
lab var educ "highest education obtained"


** 07. Parents' education: Father and/or Mother **

//highest education obtained
rename ab505 faeduc
lab var faeduc "father's highest education obtained"

rename ab510 moeduc 
lab var moeduc "mother's highest education obtained"

** 08. Own occupation **

ge firstocc_ISEI=am403c_s
lab var firstocc_ISEI "first job ISEI code"
ge lastocc_ISEI=am409d_s
lab var lastocc_ISEI "last job ISEI code"
ge occ_ISEI=am301c_s
lab var occ_ISEI "current job ISEI code"
ge occ_CBS=am301c
lab var occ_CBS "CBS classification of current job"
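
//The value labels occ_CBS and occ_ISEI attached in section 09 are not defined
//anywhere in this file. A sketch only, assuming am301c carries a value label
//in the raw data: copy that label under the name occ_CBS.
local srclbl : value label am301c
if "`srclbl'"!="" {
	cap label copy `srclbl' occ_CBS
	lab val occ_CBS occ_CBS
}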

rename am302 ind
lab var ind "type of business of current job (coded in Dutch)"
rename am308 supocc
lab var supocc "supervision in current job"


** 09. Parents' occupation **

//parents' occupation when respondent was 15
ge faocc15_CBS=ab502a
lab var faocc15_CBS "father's job (CBS classification) when respondent 15"
lab val faocc15_CBS occ_CBS
ge faocc15_ISEI=ab502a_s
lab var faocc15_ISEI "father's job (ISEI code) when respondent 15"
lab val faocc15_ISEI occ_ISEI

ge moocc15_CBS=ab507a
lab var moocc15_CBS "mother's job (CBS classification) when respondent 15"
lab val moocc15_CBS occ_CBS
ge moocc15_ISEI=ab507a_s
lab var moocc15_ISEI "mother's job (ISEI code) when respondent 15"
lab val moocc15_ISEI occ_ISEI

//parents' current type of employment
rename ab503 fawork
lab var fawork "father's type of employment"

rename ab508 mowork
lab var mowork "mother's type of employment"

** 10. Tabulate the Identified Variables **

log using /*insert your log file path here*/, replace text

** Data reading and variable selection from raw data
** Netherlands Kinship Panel Study (wave 1) 2002

** Sex **
tab sex

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs, d

** R's Own Education **
tab1 educ 

** Parental Education **
tab1 faeduc moeduc 

** R's Own Occupation **
tab1 firstocc_ISEI lastocc_ISEI occ_ISEI occ_CBS ind supocc

** Parental Occupation **
tab1 faocc15_CBS moocc15_CBS faocc15_ISEI moocc15_ISEI fawork mowork


log close

** 11. Keep the identified variables only

keep year country fid sex age birthyr ///
	 nsibs educ faeduc moeduc ///
	 firstocc_ISEI lastocc_ISEI occ_ISEI occ_CBS ind supocc ///
	 faocc15_CBS moocc15_CBS faocc15_ISEI moocc15_ISEI fawork mowork


** 12. Save the Data File **

saveold /*insert your output file path here*/, replace



** 13. Homogenising education **
** Own Education **
rename educ educ_cat

ge educ_yrs=5 if educ_cat==1
replace educ_yrs=7 if educ_cat==2
replace educ_yrs=9 if educ_cat==3
replace educ_yrs=10 if educ_cat==4
replace educ_yrs=11 if educ_cat==5
replace educ_yrs=12 if educ_cat==6
replace educ_yrs=10.5 if educ_cat==7
replace educ_yrs=15 if educ_cat==8
replace educ_yrs=17 if educ_cat==9
replace educ_yrs=20 if educ_cat==10
lab var educ_yrs "respondent's highest education in years"
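
//Optional check, a sketch only: counts respondents whose education category
//is observed but received no value in years (codes outside 1-10).
count if missing(educ_yrs) & !missing(educ_cat)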

ge educ_ISCED=020 if educ_cat==1
replace educ_ISCED=100 if educ_cat==2
replace educ_ISCED=254 if educ_cat==3
replace educ_ISCED=244 if educ_cat==4
replace educ_ISCED=344 if educ_cat==5
replace educ_ISCED=344 if educ_cat==6
replace educ_ISCED=354 if educ_cat==7
replace educ_ISCED=655 if educ_cat==8
replace educ_ISCED=600 if educ_cat==9
replace educ_ISCED=747 if educ_cat==10
lab var educ_ISCED "respondent's highest education in ISCED code"
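
//Note: Stata stores the ISCED code 020 as the number 20. A sketch of a string
//version that keeps the leading zero; educ_ISCED_str is a hypothetical name
//that is not part of the original variable list.
ge str3 educ_ISCED_str=string(educ_ISCED, "%03.0f") if !missing(educ_ISCED)
lab var educ_ISCED_str "respondent's highest education in ISCED code (string)"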


** Parents Education **
//flag=1: the father's education variable actually refers to the father
ge faeduc_flag=1 

rename faeduc faeduc_cat
rename moeduc maeduc_cat

ge faeduc_yrs=5 if faeduc_cat==1
replace faeduc_yrs=7 if faeduc_cat==2
replace faeduc_yrs=9 if faeduc_cat==3
replace faeduc_yrs=10 if faeduc_cat==4
replace faeduc_yrs=11 if faeduc_cat==5
replace faeduc_yrs=12 if faeduc_cat==6
replace faeduc_yrs=10.5 if faeduc_cat==7
replace faeduc_yrs=15 if faeduc_cat==8
replace faeduc_yrs=17 if faeduc_cat==9
replace faeduc_yrs=20 if faeduc_cat==10
replace faeduc_yrs=. if faeduc_cat==11
lab var faeduc_yrs "father's education in years"

ge faeduc_ISCED=020 if faeduc_cat==1
replace faeduc_ISCED=100 if faeduc_cat==2
replace faeduc_ISCED=254 if faeduc_cat==3
replace faeduc_ISCED=244 if faeduc_cat==4
replace faeduc_ISCED=344 if faeduc_cat==5
replace faeduc_ISCED=344 if faeduc_cat==6
replace faeduc_ISCED=354 if faeduc_cat==7
replace faeduc_ISCED=655 if faeduc_cat==8
replace faeduc_ISCED=600 if faeduc_cat==9
replace faeduc_ISCED=747 if faeduc_cat==10
replace faeduc_ISCED=. if faeduc_cat==11
lab var faeduc_ISCED "father's highest education in ISCED code"

ge maeduc_yrs=5 if maeduc_cat==1
replace maeduc_yrs=7 if maeduc_cat==2
replace maeduc_yrs=9 if maeduc_cat==3
replace maeduc_yrs=10 if maeduc_cat==4
replace maeduc_yrs=11 if maeduc_cat==5
replace maeduc_yrs=12 if maeduc_cat==6
replace maeduc_yrs=10.5 if maeduc_cat==7
replace maeduc_yrs=15 if maeduc_cat==8
replace maeduc_yrs=17 if maeduc_cat==9
replace maeduc_yrs=20 if maeduc_cat==10
replace maeduc_yrs=. if maeduc_cat==11
lab var maeduc_yrs "mother's education in years"

ge maeduc_ISCED=020 if maeduc_cat==1
replace maeduc_ISCED=100 if maeduc_cat==2
replace maeduc_ISCED=254 if maeduc_cat==3
replace maeduc_ISCED=244 if maeduc_cat==4
replace maeduc_ISCED=344 if maeduc_cat==5
replace maeduc_ISCED=344 if maeduc_cat==6
replace maeduc_ISCED=354 if maeduc_cat==7
replace maeduc_ISCED=655 if maeduc_cat==8
replace maeduc_ISCED=600 if maeduc_cat==9
replace maeduc_ISCED=747 if maeduc_cat==10
replace maeduc_ISCED=. if maeduc_cat==11
lab var maeduc_ISCED "mother's highest education in ISCED code"

** 14. Homogenising siblings **
//cutoff
ge nsibs_flag=99
lab var nsibs_flag "cutoff of total number of siblings"

lab def nsib_flag 99 "no cutoff"
lab val nsibs_flag nsib_flag

//number of brothers/sisters (nbro, nsis) not available here: both were dropped by the keep in step 11


** 15. Tab Education and Sibling Variables **
tab1 sex age birthyr
tab1 educ_cat educ_yrs faeduc_cat faeduc_yrs maeduc_cat maeduc_yrs faeduc_flag 
tab1 nsibs nsibs_flag


** 16. Save the Data File **

saveold /*insert your output file path here*/, replace
